MCDC Virtual Workshop on
Teaching Introductory Machine Learning
2023-11-01
Where does model evaluation and selection come up during an introductory machine learning course?
Early on, maybe first day — How do we know if a model is any good?
During model building/training — How to pick the “best” model?
Discussion of final model — How will the final model perform?
Data spending/allocation, performance estimation & comparisons, overfitting, the bias-variance trade-off, metrics, & explaining models
A schematic for the typical modeling process from Tidy Modeling with R
Main objective is prediction!
Take a set of predictors/features \(X\) and use them to predict an outcome/target \(Y\)
\[\widehat{Y} = \widehat{f}(X)\] This assumes there exists an \(f()\) such that \(Y = f(X) + \epsilon\) … the truth!
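The estimate-\(\widehat{f}\)-then-predict idea can be sketched in a few lines. A minimal illustration in Python with scikit-learn (the names, simulated data, and library choice are assumptions for illustration; a classroom example could equally use R/tidymodels):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Simulate data from a known truth: Y = f(X) + eps, with f(X) = 3*X
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(100, 1))
y = 3 * X.ravel() + rng.normal(0, 0.1, size=100)

model = LinearRegression().fit(X, y)  # learn f_hat from (X, y)
y_hat = model.predict(X)              # Y_hat = f_hat(X)
```

Because the truth is known here, students can check that the fitted slope lands close to 3, something impossible with real data where \(f\) is unknown.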
Assess the residuals/errors!
\[Y - \widehat{Y} = f(X) - \widehat{f}(X) + \epsilon\]
\[E[(Y - \widehat{Y})^2] = E[(f(X) - \widehat{f}(X))^2] + \mbox{var}(\epsilon)\]
Reducible error vs irreducible error
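The decomposition above is easy to demonstrate by simulation: even the true \(f\) cannot push expected squared error below \(\mbox{var}(\epsilon)\). A small sketch (simulated data and function names are mine, not from the workshop):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
x = rng.uniform(0, 1, n)
eps = rng.normal(0, 0.5, n)          # irreducible noise, var(eps) = 0.25
y = np.sin(2 * np.pi * x) + eps      # truth: f(x) = sin(2*pi*x)

f_poor = lambda z: 0.0 * z           # a deliberately bad f_hat
mse_poor = np.mean((y - f_poor(x)) ** 2)   # reducible + irreducible error

f_true = lambda z: np.sin(2 * np.pi * z)
mse_true = np.mean((y - f_true(x)) ** 2)   # ~ var(eps): the floor
```

`mse_true` hovers near 0.25 no matter how large `n` gets: that gap is the irreducible error, while everything `mse_poor` adds on top is reducible.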
Overfitting
Training vs Testing (Data spending/allocation)
Avoiding overfitting: the goal is an honest estimate of out-of-sample performance
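A classic classroom demonstration of why training error is not an honest estimate: fit a very flexible model and compare training vs test error. A sketch in Python (the unpruned decision tree and simulated data are illustrative choices, not from the workshop):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 1))
y = np.sin(2 * np.pi * X.ravel()) + rng.normal(0, 0.3, size=200)

# Spend the data: hold out 25% that the model never sees during training
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

tree = DecisionTreeRegressor().fit(X_tr, y_tr)  # unpruned: memorizes training data
train_mse = mean_squared_error(y_tr, tree.predict(X_tr))
test_mse = mean_squared_error(y_te, tree.predict(X_te))
# train_mse is ~0 while test_mse is much larger: the signature of overfitting
```

The near-zero training error would suggest a perfect model; only the held-out test error reveals how the tree actually generalizes.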
A schematic for the typical modeling process from Tidy Modeling with R
Pick an evaluation metric for model comparisons!
Don’t always have to go with best performing model
1 standard error rule: Are the competing models really performing differently?
Occam’s Razor or KISS
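The 1-SE rule above can be made concrete with cross-validation: pick the simplest model whose CV error is within one standard error of the best. A sketch in Python (candidate depths, data, and variable names are illustrative assumptions):

```python
import numpy as np
from sklearn.model_selection import cross_val_score, KFold
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(300, 1))
y = np.sin(2 * np.pi * X.ravel()) + rng.normal(0, 0.3, size=300)

depths = [1, 2, 3, 5, 8, 12]  # candidate complexities, simplest first
cv = KFold(n_splits=5, shuffle=True, random_state=0)
means, ses = [], []
for d in depths:
    scores = -cross_val_score(DecisionTreeRegressor(max_depth=d, random_state=0),
                              X, y, cv=cv, scoring="neg_mean_squared_error")
    means.append(scores.mean())
    ses.append(scores.std(ddof=1) / np.sqrt(len(scores)))  # SE of the CV estimate

best = int(np.argmin(means))                 # lowest mean CV error
threshold = means[best] + ses[best]          # best mean + 1 standard error
# simplest (first) model whose CV error clears the threshold
one_se = next(i for i in range(len(depths)) if means[i] <= threshold)
```

Because the candidates are ordered simplest-first, `one_se` is never more complex than `best`: Occam's Razor applied with an uncertainty-aware tie-break.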
Using the test data
Not restricted to the metric used for comparison!
Might use metrics that are easier to explain
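Once a final model is chosen, the held-out test set can report several metrics at once, including ones that are easier to explain than the selection metric. A sketch (simulated data and metric choices are illustrative assumptions):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 2 * X[:, 0] - X[:, 1] + rng.normal(0, 0.5, size=200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
final = LinearRegression().fit(X_tr, y_tr)
pred = final.predict(X_te)

rmse = mean_squared_error(y_te, pred) ** 0.5  # say, the metric used for selection
mae = mean_absolute_error(y_te, pred)         # same units as Y, easy to explain
r2 = r2_score(y_te, pred)                     # "proportion of variance explained"
```

Reporting MAE or \(R^2\) alongside RMSE costs nothing and often communicates better to a non-technical audience.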
Be careful of causal interpretations